disagreement probability
- Asia > India (0.14)
- North America > Canada > Alberta (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States (0.04)
- Asia > Middle East > Jordan (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Med-CoDE: Medical Critique based Disagreement Evaluation Framework
Gupta, Mohit, Aizawa, Akiko, Shah, Rajiv Ratn
The emergence of large language models (LLMs) has significantly influenced numerous fields, including healthcare, by enhancing the capabilities of automated systems to process and generate human-like text. However, despite their advancements, the reliability and accuracy of LLMs in medical contexts remain critical concerns. Current evaluation methods often lack robustness and fail to provide a comprehensive assessment of LLM performance, leading to potential risks in clinical settings. In this work, we propose Med-CoDE, a specifically designed evaluation framework for medical LLMs to address these challenges. The framework leverages a critique-based approach to quantitatively measure the degree of disagreement between model-generated responses and established medical ground truths. This framework captures both accuracy and reliability in medical settings. The proposed evaluation framework aims to fill the existing gap in LLM assessment by offering a systematic method to evaluate the quality and trustworthiness of medical LLMs. Through extensive experiments and case studies, we illustrate the practicality of our framework in providing a comprehensive and reliable evaluation of medical LLMs.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > Mexico > Mexico City > Mexico City (0.04)
Mitigating Spurious Correlations via Disagreement Probability
Han, Hyeonggeun, Kim, Sehwan, Joo, Hyungjun, Hong, Sangwoo, Lee, Jungwoo
Models trained with empirical risk minimization (ERM) are prone to bias toward spurious correlations between target labels and bias attributes, which leads to poor performance on data groups lacking spurious correlations. Addressing this problem is particularly challenging when access to bias labels is not permitted. To mitigate the effect of spurious correlations without bias labels, we first introduce a novel training objective designed to robustly enhance model performance across all data samples, irrespective of the presence of spurious correlations. From this objective, we then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels. DPR leverages the disagreement between the target label and the prediction of a biased model to identify bias-conflicting samples (those without spurious correlations) and upsamples them according to the disagreement probability. Empirical evaluations on multiple benchmarks demonstrate that DPR achieves state-of-the-art performance over existing baselines that do not use bias labels. Furthermore, we provide a theoretical analysis that details how DPR reduces dependency on spurious correlations.
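The resampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the disagreement probability of a sample is one minus the probability the biased model assigns to the target label, and that sampling weights are proportional to it. The function names, the `eps` smoothing term, and the toy probabilities are all hypothetical.

```python
import random


def disagreement_probabilities(biased_probs, labels):
    """Assumed definition: 1 - p_biased(y | x) for each sample.

    biased_probs[i][c] is the biased model's probability of class c for
    sample i; labels[i] is the target class index of sample i.
    """
    return [1.0 - probs[y] for probs, y in zip(biased_probs, labels)]


def sampling_weights(disagreement, eps=1e-8):
    """Normalize disagreement probabilities into a sampling distribution.

    eps (a hypothetical smoothing term) keeps every sample reachable.
    """
    total = sum(disagreement) + eps * len(disagreement)
    return [(d + eps) / total for d in disagreement]


def resample_indices(weights, num_samples, seed=0):
    """Draw training indices with replacement according to the weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)


# Toy outputs of a biased model on four samples (made-up numbers).
biased_probs = [
    [0.95, 0.05],  # label 0, biased model agrees (bias-aligned)
    [0.90, 0.10],  # label 0, biased model agrees (bias-aligned)
    [0.85, 0.15],  # label 1, biased model disagrees (bias-conflicting)
    [0.10, 0.90],  # label 1, biased model agrees (bias-aligned)
]
labels = [0, 0, 1, 1]

disagreement = disagreement_probabilities(biased_probs, labels)
weights = sampling_weights(disagreement)
indices = resample_indices(weights, num_samples=1000)
```

In this toy setup the third sample, where the biased model contradicts the label, receives the largest weight and is therefore drawn most often when building batches for the debiased model.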
Cooperative Graphical Models
Stefanie Jegelka Dept. of Computer Science, ETH Zürich
We study a rich family of distributions that capture variable interactions significantly more expressive than those representable with low-treewidth or pairwise graphical models, or log-supermodular models. We call these cooperative graphical models. Yet, this family retains structure, which we carefully exploit for efficient inference techniques. Our algorithms combine the polyhedral structure of submodular functions in new ways with variational inference methods to obtain both lower and upper bounds on the partition function. While our fully convex upper bound is minimized as an SDP or via tree-reweighted belief propagation, our lower bound is tightened via belief propagation or mean-field algorithms. The resulting algorithms are easy to implement and, as our experiments show, effectively obtain good bounds and marginals for synthetic and real-world examples.
- Europe > Switzerland > Zürich > Zürich (0.40)
- Asia > Middle East > Jordan (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)